{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Using Neo4j's graph database with AG2 agents for Question & Answering\n",
    "\n",
    "AG2 provides GraphRAG integration through agent capabilities. This is an example utilising the integration of Neo4j's property graph database with LlamaIndex's graph query engine.\n",
    "\n",
    "\n",
    "````{=mdx}\n",
    ":::info Requirements\n",
    "To install the LlamaIndex, Neo4j, and document processing dependencies, install with the 'neo4j' extra:\n",
    "\n",
    "```bash\n",
    "pip install -U ag2[openai,neo4j]\n",
    "```\n",
    "\n",
    "> **Note:** If you have been using `autogen` or `ag2`, all you need to do is upgrade it using:  \n",
    "> ```bash\n",
    "> pip install -U autogen[openai,neo4j]\n",
    "> ```\n",
    "> or  \n",
    "> ```bash\n",
    "> pip install -U ag2[openai,neo4j]\n",
    "> ```\n",
    "> as `autogen`, and `ag2` are aliases for the same PyPI package.  \n",
    "\n",
    ":::\n",
    "````"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Set Configuration and OpenAI API Key\n",
    "\n",
    "By default, in order to use LlamaIndex with Neo4j you need to have an OpenAI key in your environment variable `OPENAI_API_KEY`.\n",
    "\n",
    "You can utilise an OAI_CONFIG_LIST file and extract the OpenAI API key and put it in the environment, as will be shown in the following cell.\n",
    "\n",
    "Alternatively, you can load the environment variable yourself.\n",
    "\n",
    "````{=mdx}\n",
    ":::tip\n",
    "Learn more about configuring LLMs for agents [here](https://docs.ag2.ai/latest/docs/user-guide/basic-concepts/llm-configuration).\n",
    ":::\n",
    "````"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "import autogen\n",
    "\n",
    "llm_config = autogen.LLMConfig.from_json(path=\"OAI_CONFIG_LIST\")\n",
    "\n",
    "# Put the OpenAI API key into the environment\n",
    "os.environ[\"OPENAI_API_KEY\"] = llm_config.config_list[0][\"api_key\"]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# This is needed to allow nested asyncio calls for Neo4j in Jupyter\n",
    "import nest_asyncio\n",
    "\n",
    "nest_asyncio.apply()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Key Information: Using Neo4j with LLM Models 🚀\n",
    "\n",
    "> **Important**  \n",
    "> - **Default Models**:\n",
    ">   - **Question Answering**: OpenAI's `gpt-4.1` with `temperature=0.0`.\n",
    ">   - **Embedding**: OpenAI's `text-embedding-3-small`.\n",
    "> \n",
    "> - **Customization**:\n",
    ">   You can change these defaults by setting the following parameters on the `Neo4jGraphQueryEngine`:\n",
    ">   - `llm`: Specify a LLM instance with a llm you like.\n",
    ">   - `embedding`: Specify a BaseEmbedding instance with a embedding model.\n",
    "> - **Reference**\n",
    ">   - https://docs.llamaindex.ai/en/stable/module_guides/models/llms/\n",
    ">   - https://docs.llamaindex.ai/en/stable/examples/property_graph/graph_store/\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Create a Knowledge Graph with Your Own Data\n",
    "\n",
    "**Note:** You need to have a Neo4j database running. If you are running one in a Docker container, please ensure your Docker network is setup to allow access to it. \n",
    "\n",
    "In this example, the Neo4j endpoint is set to host=\"bolt://172.17.0.3\" and port=7687, please adjust accordingly. For how to spin up a Neo4j with Docker, you can refer to [this](https://docs.llamaindex.ai/en/stable/examples/property_graph/property_graph_neo4j/#:~:text=stores%2Dneo4j-,Docker%20Setup,%C2%B6,-To%20launch%20Neo4j)\n",
    "\n",
    "We initialise the database with a Word document, creating the Property graph in Neo4j."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# IMPORTS\n",
    "from llama_index.embeddings.openai import OpenAIEmbedding\n",
    "from llama_index.llms.openai import OpenAI\n",
    "\n",
    "from autogen import ConversableAgent, UserProxyAgent\n",
    "from autogen.agentchat.contrib.graph_rag.neo4j_graph_query_engine import Neo4jGraphQueryEngine"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### A Simple Example\n",
    "\n",
    "In this example, the graph schema is auto-generated. Entities and relationship are created as they fit into the data\n",
    "\n",
    "LlamaIndex supports a lot of extensions including docx, text, pdf, csv, etc. Find more details in Neo4jGraphQueryEngine. You may need to install dependencies for each extension. In this example, we need `pip install docx2txt`\n",
    "\n",
    "We start by creating a Neo4j property graph (knowledge graph) with a sample employee handbook of a fictional company called BUZZ"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# load documents\n",
    "from autogen.agentchat.contrib.graph_rag.document import Document, DocumentType\n",
    "\n",
    "input_path = \"../test/agentchat/contrib/graph_rag/BUZZ_Employee_Handbook.docx\"\n",
    "input_documents = [Document(doctype=DocumentType.TEXT, path_or_url=input_path)]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# pip install docx2txt\n",
    "# Auto generate graph schema from unstructured data\n",
    "\n",
    "\n",
    "# Create Neo4jGraphQueryEngine\n",
    "query_engine = Neo4jGraphQueryEngine(\n",
    "    username=\"neo4j\",  # Change if you reset username\n",
    "    password=\"password\",  # Change if you reset password\n",
    "    host=\"bolt://172.17.0.3\",  # Change\n",
    "    port=7687,  # if needed\n",
    "    llm=OpenAI(model=\"gpt-4.1\", temperature=0.0),  # Default, no need to specify\n",
    "    embedding=OpenAIEmbedding(model_name=\"text-embedding-3-small\"),  # except you want to use a different model\n",
    "    database=\"neo4j\",  # Change if you want to store the graphh in your custom database\n",
    ")\n",
    "\n",
    "# Ingest data and create a new property graph\n",
    "query_engine.init_db(input_doc=input_documents)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Instead of creating a new property graph, if you want to use an existing graph, you can connect to its database"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.embeddings.openai import OpenAIEmbedding\n",
    "from llama_index.llms.openai import OpenAI\n",
    "\n",
    "from autogen.agentchat.contrib.graph_rag.neo4j_graph_query_engine import Neo4jGraphQueryEngine\n",
    "\n",
    "query_engine = Neo4jGraphQueryEngine(\n",
    "    username=\"neo4j\",  # Change if you reset username\n",
    "    password=\"password\",  # Change if you reset password\n",
    "    host=\"bolt://172.17.0.3\",  # Change\n",
    "    port=7687,  # if needed\n",
    "    llm=OpenAI(model=\"gpt-4.1\", temperature=0.0),  # Default, no need to specify\n",
    "    embedding=OpenAIEmbedding(model_name=\"text-embedding-3-small\"),  # except you want to use a different model\n",
    "    database=\"neo4j\",  # Change if you want to store the graphh in your custom database\n",
    ")\n",
    "\n",
    "# Connect to the existing graph\n",
    "query_engine.connect_db()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "An example property graph:\n",
    "\n",
    "![neo4j_property_graph_1.png](https://media.githubusercontent.com/media/ag2ai/ag2/refs/heads/main/notebook/neo4j_property_graph_1.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Add capability to a ConversableAgent and query them\n",
    "Notice that we intentionally moved the specific content of Equal Employment Opportunity Policy into a different document to add later"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from autogen.agentchat.contrib.graph_rag.neo4j_graph_rag_capability import Neo4jGraphCapability\n",
    "\n",
    "# Create a ConversableAgent (no LLM configuration)\n",
    "graph_rag_agent = ConversableAgent(\n",
    "    name=\"buzz_agent\",\n",
    "    human_input_mode=\"NEVER\",\n",
    ")\n",
    "\n",
    "# Associate the capability with the agent\n",
    "graph_rag_capability = Neo4jGraphCapability(query_engine)\n",
    "graph_rag_capability.add_to_agent(graph_rag_agent)\n",
    "\n",
    "# Create a user proxy agent to converse with our RAG agent\n",
    "user_proxy = UserProxyAgent(\n",
    "    name=\"user_proxy\",\n",
    "    human_input_mode=\"ALWAYS\",\n",
    ")\n",
    "\n",
    "user_proxy.initiate_chat(graph_rag_agent, message=\"Which company is the employer?\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Revisit the example by defining custom entities, relations and schema\n",
    "\n",
    "By providing custom entities, relations and schema, you could guide the engine to create a graph that better extracts the structure within the data.\n",
    "\n",
    "We set `strict=True` to tell the engine to only extracting allowed relationships from the data for each entity\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from typing import Literal\n",
    "\n",
    "# best practice to use upper-case\n",
    "entities = Literal[\"EMPLOYEE\", \"EMPLOYER\", \"POLICY\", \"BENEFIT\", \"POSITION\", \"DEPARTMENT\", \"CONTRACT\", \"RESPONSIBILITY\"]\n",
    "relations = Literal[\n",
    "    \"FOLLOWS\",\n",
    "    \"PROVIDES\",\n",
    "    \"APPLIES_TO\",\n",
    "    \"DEFINED_AS\",\n",
    "    \"ASSIGNED_TO\",\n",
    "    \"PART_OF\",\n",
    "    \"MANAGES\",\n",
    "    \"REQUIRES\",\n",
    "    \"ENTITLED_TO\",\n",
    "    \"REPORTS_TO\",\n",
    "]\n",
    "\n",
    "# Define which entities can have which relationships. It can also be used a a guidance if strict is False.\n",
    "schema = {\n",
    "    \"EMPLOYEE\": [\"FOLLOWS\", \"APPLIES_TO\", \"ASSIGNED_TO\", \"ENTITLED_TO\", \"REPORTS_TO\"],\n",
    "    \"EMPLOYER\": [\"PROVIDES\", \"DEFINED_AS\", \"MANAGES\", \"REQUIRES\"],\n",
    "    \"POLICY\": [\"APPLIES_TO\", \"DEFINED_AS\", \"REQUIRES\"],\n",
    "    \"BENEFIT\": [\"PROVIDES\", \"ENTITLED_TO\"],\n",
    "    \"POSITION\": [\"DEFINED_AS\", \"PART_OF\", \"ASSIGNED_TO\"],\n",
    "    \"DEPARTMENT\": [\"PART_OF\", \"MANAGES\", \"REQUIRES\"],\n",
    "    \"CONTRACT\": [\"PROVIDES\", \"REQUIRES\", \"APPLIES_TO\"],\n",
    "    \"RESPONSIBILITY\": [\"ASSIGNED_TO\", \"REQUIRES\", \"DEFINED_AS\"],\n",
    "}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Create the query engine and load the document"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Create Neo4jGraphQueryEngine\n",
    "query_engine = Neo4jGraphQueryEngine(\n",
    "    username=\"neo4j\",  # Change these as needed\n",
    "    password=\"neo4jneo4j\",\n",
    "    host=\"bolt://192.168.0.115\",\n",
    "    port=7687,\n",
    "    database=\"neo4j\",\n",
    "    llm=OpenAI(model=\"gpt-4.1\", temperature=0.0),\n",
    "    embedding=OpenAIEmbedding(model_name=\"text-embedding-3-small\"),\n",
    "    entities=entities,  # possible entities\n",
    "    relations=relations,  # possible relations\n",
    "    schema=schema,\n",
    "    strict=True,  # enforce the extracted relationships to be in the schema\n",
    ")\n",
    "\n",
    "# Ingest data and initialize the database\n",
    "query_engine.init_db(input_doc=input_documents)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The Property graph screenshot is shown below:\n",
    "\n",
    "![neo4j_property_graph_2.png](https://media.githubusercontent.com/media/ag2ai/ag2/refs/heads/main/notebook/neo4j_property_graph_2.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Add capability to a ConversableAgent and query them again\n",
    "You should find the answers are much more detailed and accurate since our schema fits well"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from autogen import ConversableAgent, UserProxyAgent\n",
    "from autogen.agentchat.contrib.graph_rag.neo4j_graph_rag_capability import Neo4jGraphCapability\n",
    "\n",
    "# Create a ConversableAgent (no LLM configuration)\n",
    "graph_rag_agent = ConversableAgent(\n",
    "    name=\"rag_agent\",\n",
    "    human_input_mode=\"NEVER\",\n",
    ")\n",
    "\n",
    "# Associate the capability with the agent\n",
    "graph_rag_capability = Neo4jGraphCapability(query_engine)\n",
    "graph_rag_capability.add_to_agent(graph_rag_agent)\n",
    "\n",
    "# Create a user proxy agent to converse with our RAG agent\n",
    "user_proxy = UserProxyAgent(name=\"user_proxy\", human_input_mode=\"ALWAYS\", code_execution_config=False)\n",
    "\n",
    "user_proxy.initiate_chat(graph_rag_agent, message=\"Which company is the employer?\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Incrementally add new documents to the existing knoweledge graph."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "input_path = \"../test/agentchat/contrib/graph_rag/BUZZ_Equal-Employment-Opportunity-Policy-Detailed.docx\"\n",
    "input_documents = [Document(doctype=DocumentType.TEXT, path_or_url=input_path)]\n",
    "\n",
    "_ = query_engine.add_records(input_documents)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Checking the property graph, we'll find a different Equal Employment Opportunity Policy Node\n",
    "\n",
    "![neo4j_property_graph_3.png](https://media.githubusercontent.com/media/ag2ai/ag2/refs/heads/main/notebook/neo4j_property_graph_3.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Now let's create a new GraphRag agent and some questions related to both documents"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from autogen.agentchat.contrib.graph_rag.neo4j_graph_rag_capability import Neo4jGraphCapability\n",
    "\n",
    "# Ask questions about both documents\n",
    "user_proxy.initiate_chat(graph_rag_agent, message=\"What is Equal Employment Opportunity Policy at BUZZ?\")"
   ]
  }
 ],
 "metadata": {
  "front_matter": {
   "description": "Neo4j GraphRAG utilises a knowledge graph and can be added as a capability to agents.",
   "tags": [
    "RAG"
   ]
  },
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
